Stop "reinventing" everything to "solve" alignment

Update: 2024-04-17

Description

Integrating some non-computing science into reinforcement learning from human feedback can give us the models we want.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/reinventing-llm-alignment

0:00 Stop "reinventing" everything to "solve" AI alignment
2:19 Social Choice for AI Alignment: Dealing with Diverse Human Feedback
7:03 OLMo 1.7 7B: A truly open model with actually good benchmarks

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_013.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_015.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_018.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_024.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_027.png

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Nathan Lambert